Assignment 2 - COVID-19 and governments’ response
Hi Dr. Andrew, this is my work for Assignment 2. I hope you like it~
Prepare for R markdown
Here, I turn off warnings and messages for the later chunks. The plotting steps produce many warning messages, so I set both options to FALSE to keep the document clear and readable. I have checked all the warnings to make sure the processing is appropriate, and I only mention a warning when necessary.
I also use the rmdformats package, which helps organize the whole document better.
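The setup chunk itself is not shown, so the following is only a minimal sketch of how warnings and messages can be switched off globally with knitr. It assumes the standard `opts_chunk$set()` interface; the exact options used in this document may differ.

```r
# Assumed setup chunk: silence warnings and messages in every later chunk.
library(knitr)

opts_chunk$set(
  warning = FALSE,  # do not print warnings in the rendered output
  message = FALSE   # do not print messages (e.g. package start-up notes)
)
```

Individual chunks can still override these defaults, for example with `warning = TRUE` in a chunk header, when a warning is worth showing.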
Introduction
Since the COVID-19 outbreak, countries worldwide have put varying degrees of effort into fighting the disease. However, almost all governments have been blamed by their citizens no matter what measures they took. By now, most major countries have started to pay less attention to COVID-19, and the disaster seems to be passing. It can look as if all the efforts made by these governments were of no use and failed to stop the spread of COVID-19 in the long run. Is that true? At this post-COVID time point, I investigate the relationship between governments’ responses to COVID-19 (Public Health and Economic Measures) and its impact (case rate and death rate) in each country, to identify the role of different strategies in coping with COVID-19. These results could help governments determine the most efficient methods for containing communicable disease epidemics in the future.
First, I load, wrangle and visualize the data. Finally, I conduct various statistical analyses and build a model to reveal the relationship between the response patterns of governments and the severity of COVID-19 (reported cases and death rate) in each country.
Meanwhile, people often assume that developed countries with high GDP can handle this kind of worldwide pandemic better, so we introduce GDP per capita (PPP) into our presentation and analysis, comparing the responses of countries with high and low PPP and their influence on COVID-19-related deaths.
To be more specific, we first use the medians of GDP per capita (PPP), Rigidity Public Health and Economic Measures to divide the countries into high and low subgroups for PPP, public health response and economic response. As in the original paper, we use the index of public health responses to COVID-19 at two time points, April 15, 2020 and September 15, 2020, with reported cases per million people as the main outcome, to evaluate the relationship between government response and COVID-19 in countries at different levels of development.
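The median split described above can be sketched as follows. The data values and the column names `ppp_level` and `ph_level` are invented for illustration, and treating values strictly above the median as "high" is an assumption, not necessarily the rule used later in the analysis.

```r
library(dplyr)
library(tibble)

# Toy values, invented for illustration only (not taken from the datasets).
toy <- tibble(
  id = c("AAA", "BBB", "CCC", "DDD"),
  gdp_per_capita = c(1500, 9000, 25000, 48000),
  Rigidity_Public_Health = c(0.30, 0.55, 0.40, 0.70)
)

# Label each country "high" or "low" relative to the median of each variable.
toy_groups <- toy %>%
  mutate(
    ppp_level = if_else(
      gdp_per_capita > median(gdp_per_capita, na.rm = TRUE), "high", "low"
    ),
    ph_level = if_else(
      Rigidity_Public_Health > median(Rigidity_Public_Health, na.rm = TRUE),
      "high", "low"
    )
  )

toy_groups
```

The same pattern would apply to Economic Measures; with real data, the median should be computed at a fixed time point so that each country falls into exactly one subgroup.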
Load packages
First, I load all the packages used in the three questions. I have already installed them with the install.packages function directly in the console, e.g., install.packages("tidyverse").
Now we attach these packages to our environment with the library() function (e.g., library(tidyverse)). The main packages include:
- tidyverse: a collection of R packages widely used in data science. In this assignment, we specifically use ggplot2, dplyr and purrr.
- ggplot2: used to visualise the data.
- dplyr: includes select, mutate, filter, summarise and arrange.
- gganimate: extends the grammar of graphics as implemented by ggplot2 to include the description of animation, for drawing dynamic pictures.
- rworldmap: a package for visualising global data, concentrating on data referenced by country codes or gridded at half degree resolution.
- plotly: an R package for creating interactive web-based graphs.
- ggpubr: its ggarrange function can be used to arrange multiple ggplots on the same page.
- cowplot: a simple add-on to ggplot2.
- rmdformats: provides several HTML output formats with unique and attractive styles.
library(tidyverse)
library(gganimate)
library(rworldmap)
library(plotly)
library(ggpubr)
library(cowplot)
library(rmdformats)
Read in Data and Data Wrangling
Now, we read in the data that we are going to analyse and store it as variables in our environment.
For this assignment, I downloaded two open-source datasets: “GOVERNMENTS’ RESPONSES TO COVID-19 (response2covid19) - DATASET” and “Data on COVID-19 (coronavirus) by Our World in Data”. The related reports and descriptions of both were first published in the journal Scientific Data, and the data come from GOVERNMENTS’ RESPONSES TO COVID-19 and Data on COVID-19.
Here, I first combine these two datasets and then visualize the data using various methods, presenting some comprehensive information to the reader.
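As a minimal sketch of how combining two tables by country code and date behaves, the toy example below joins an invented response table to an invented case table. All values are made up; only the key names `id` and `t_date` follow the code in this section.

```r
library(dplyr)
library(tibble)

# Toy government-response table (invented values).
toy_response <- tibble(
  id = c("AAA", "AAA", "BBB"),
  t_date = as.Date(c("2020-04-15", "2020-09-15", "2020-04-15")),
  Rigidity_Public_Health = c(0.5, 0.4, 0.6)
)

# Toy COVID table (invented values); country "BBB" is deliberately missing.
toy_covid <- tibble(
  id = c("AAA", "AAA"),
  t_date = as.Date(c("2020-04-15", "2020-09-15")),
  population = c(1e6, 1e6),
  total_cases_per_million = c(120, 900)
)

# left_join keeps every response row; unmatched rows get NA in the new columns.
toy_joined <- toy_response %>%
  left_join(toy_covid, by = c("id", "t_date"))

# Dropping rows with missing population removes the unmatched country.
toy_complete <- toy_joined %>%
  filter(!is.na(population))
```

Both key columns must have the same type in the two tables (here `Date` for `t_date`), otherwise the join fails or matches nothing.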
# Our World in Data: keep the columns we need and add a Date column
covid_data <- read_csv("owid_covid_data.csv") %>%
  select(iso_code, date, population, population_density, gdp_per_capita,
         total_cases_per_million, new_cases,
         total_deaths_per_million, new_deaths) %>%
  rename(id = iso_code) %>%
  mutate(t_date = as.Date(date))

# Government responses: keep the two response indices
Government_response_all <- read_csv("Gov_Responses2Covid19_2021.csv")
Government_response <- Government_response_all %>%
  rename(region = country, t_date = d) %>%
  select(region, t_date, id, Rigidity_Public_Health, Economic_Measures)

# Join by country code and date, dropping rows without population data
my_data_all <- Government_response %>%
  left_join(covid_data, by = c('id','t_date')) %>%
  filter(!is.na(population))

# Note: this second assignment overwrites the filtered result above,
# so rows with missing population are kept in my_data_all
my_data_all <- left_join(Government_response, covid_data, by = c('id','t_date'))
Data Visualisation
Prepare for data visualisation - data wrangling 2
Here we wrangle our data to make it suitable for drawing animated plots. The package and functions used are:
- rworldmap: a package for visualising global data.
- joinCountryData2Map: joins user data referenced by country codes or names to an internal map, ready for plotting.
- fortify: converts a spatial object (here, the map polygons) to a data frame for ggplot2.
- merge: merges two data frames by common columns or row names.
- arrange: arranges rows by column values.
- distinct(): retains only unique/distinct rows from an input table.
- left_join(): keeps every row of the left table and adds the matching columns from the right table.
- as.Date: converts between character representations and objects of class “Date”.
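To illustrate why select() followed by distinct() is useful here, the sketch below builds a country-to-continent lookup from a toy polygon table in which each country appears on several rows. The table and its values are invented; the real map data has many more columns.

```r
library(dplyr)
library(tibble)

# Toy polygon table: one row per polygon vertex, so countries repeat.
toy_poly <- tibble(
  ISO3 = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  REGION = c("Asia", "Asia", "Asia", "Europe", "Europe"),
  order = c(1, 2, 3, 1, 2)
)

# Keep only the two columns of interest, rename them, and drop duplicates
# so that each country code maps to exactly one continent.
toy_lookup <- toy_poly %>%
  select(REGION, ISO3) %>%
  rename(id = ISO3, continent = REGION) %>%
  distinct()

toy_lookup
```

A lookup with one row per country can then be joined to the main data without multiplying its rows.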
my_data_gif <- my_data_all %>%
  rename(iso3c = id)
map_gif <- joinCountryData2Map(my_data_gif, joinCode = "ISO3", nameJoinColumn = "iso3c")
## 109410 codes from your data successfully matched countries in the map
## 1563 codes from your data failed to match with a country code in the map
## 33 codes from the map weren't represented in your data
map_gif_poly <- fortify(map_gif) # extract polygons
map_gif_poly <- merge(map_gif_poly, map_gif@data, by.x = "id", by.y = "ADMIN", all.x = TRUE)
map_poly_all <- map_gif_poly %>%
  arrange(id, order) %>%
  select(REGION, ISO3.1) %>%
  rename(id = ISO3.1, continent = REGION) %>%
  distinct()
my_data_gif_all <- left_join(my_data_all, map_poly_all, by = 'id') %>%
  filter(!is.na(continent) & t_date <= as.Date("2021-05-15"))
Animated plots
These plots present the dynamic change in reported cases over time together with the measures each government took at that time. This type of plot uses as much of our data as possible and is also an interesting way to show our data to the reader.
p_cases_PublicHealth <- my_data_gif_all %>%
  ggplot(aes(x = total_cases_per_million, y = Rigidity_Public_Health,
             size = population, colour = region)) +
  geom_point(show.legend = FALSE, alpha = 0.7) +
  facet_wrap(~continent) +
  scale_color_viridis_d() +
  scale_size(range = c(2, 12)) +
  labs(x = "total cases (per million)", y = "Public Health Response") +
  transition_time(t_date) +
  labs(title = "date: {frame_time}") +
  shadow_wake(wake_length = 0.1, alpha = FALSE)
p_cases_Econ <- my_data_gif_all %>%
  ggplot(aes(x = total_cases_per_million, y = Economic_Measures,
             size = population_density, colour = region)) +
  geom_point(show.legend = FALSE, alpha = 0.7) +
  facet_wrap(~continent) +
  scale_color_viridis_d() +
  scale_size(range = c(2, 12)) +
  labs(x = "total cases (per million)", y = "Economic Response") +
  transition_time(t_date) +
  labs(title = "date: {frame_time}") +
  shadow_wake(wake_length = 0.1, alpha = FALSE)
p_cases_PublicHealth
p_cases_Econ
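Printing a gganimate object renders it with default settings. As a hedged sketch of how the rendering could be controlled explicitly, the toy example below builds a small animation and renders it with a chosen number of frames and frame rate. The data are invented, the names `toy` and `toy_anim` are hypothetical, and the frame settings are arbitrary choices rather than the ones used for the plots above.

```r
library(ggplot2)
library(gganimate)

# Toy data, invented for illustration: two regions observed on three dates.
toy <- data.frame(
  region = rep(c("A", "B"), each = 3),
  t_date = rep(as.Date(c("2020-04-01", "2020-05-01", "2020-06-01")), times = 2),
  total_cases_per_million = c(10, 50, 200, 5, 20, 60),
  Rigidity_Public_Health = c(0.2, 0.6, 0.5, 0.1, 0.4, 0.3)
)

# transition_time() animates over the date; {frame_time} shows it in the title.
toy_anim <- ggplot(toy, aes(x = total_cases_per_million,
                            y = Rigidity_Public_Health,
                            colour = region)) +
  geom_point(size = 4) +
  transition_time(t_date) +
  labs(title = "date: {frame_time}")

# Render only if the gifski package is installed (an assumption about the
# local setup); fewer frames render faster, more frames look smoother.
if (requireNamespace("gifski", quietly = TRUE)) {
  toy_gif <- animate(toy_anim, nframes = 20, fps = 5,
                     renderer = gifski_renderer())
}
```

For the full dataset, rendering can take a while, so a small `nframes` is convenient while drafting.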